AITopics | lower bound

Collaborating Authors

lower bound

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Instance-Optimal Estimation with Multiple LLM Judges on a Budget

Lee, Junghyun, Kim, Sanghwa, Jedra, Yassir, Proutière, Alexandre, Yun, Se-Young

arXiv.org Machine LearningMay-25-2026

Evaluating large language models increasingly relies on LLM-as-a-judge protocols, but such evaluations remain costly: different judges have different prices and reliabilities, and the difficulty of each prompt-response pair can vary substantially. This raises a basic allocation question: under a fixed budget, how should one distribute evaluation queries across heterogeneous judges and instances to obtain the most accurate score estimates? We formalize this question as *budgeted heteroskedastic multi-judge estimation*. Given $K$ prompt-response pairs, $J$ judges with known costs, and unknown query-judge variances, the goal is to estimate a bounded score vector while minimizing an $\ell_p$-error. Our first contribution is to analyze the inverse-variance weighted estimator (IVWE) and to derive the oracle allocation that minimizes its error rate. Since this allocation depends on the unknown variances, we then address the practical unknown-variance setting by proposing EST-IVWE, an adaptive algorithm that constructs and leverages *optimistically biased* variance estimates to stabilize the empirical allocation. We prove that EST-IVWE matches the oracle IVWE rate up to lower-order terms in the budget. Our second and central theoretical contribution is a matching *local* minimax lower bound, which establishes the instance-optimality of the proposed algorithms. A key technical insight is that Fano-type high-probability arguments are too coarse for this problem: their packing construction loses the local variance structure that governs the optimal allocation. We instead use an Assouad-type in-expectation argument, based on local perturbations, which preserves this structure and yields the sharp allocation-dependent lower bound. Finally, we numerically validate the superiority of our approach over naïve uniform allocation on synthetic and HelpSteer2 datasets.

large language model, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2605.23362

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

On the Learning Curves of Revenue Maximization

Hanneke, Steve, Kalavasis, Alkis, Moran, Shay, Velegkas, Grigoris

arXiv.org Machine LearningApr-30-2026

Learning curves are a fundamental primitive in supervised learning, describing how an algorithm's performance improves with more data and providing a quantitative measure of its generalization ability. Formally, a learning curve plots the decay of an algorithm's error for a fixed underlying distribution as a function of the number of training samples. Prior work on revenue-maximizing learning algorithms, starting with the seminal work of Cole and Roughgarden [STOC, 2014], adopts a distribution-free perspective, which parallels the PAC learning framework in learning theory. This approach evaluates performance against the hardest possible sequence of valuation distributions, one for each sample size, effectively defining the upper envelope of learning curves over all possible distributions, thus leading to error bounds that do not capture the shape of the learning curves. In this work we initiate the study of learning curves for revenue maximization and provide a near-complete characterization of their rate of decay in the basic setting of a single item and a single buyer. In the absence of any restriction on the valuation distribution, we show that there exists a Bayes-consistent algorithm, meaning that its learning curve converges to zero for any arbitrary valuation distribution as the number of samples $n \to \infty$. However, this convergence must be arbitrarily slow, even if the optimal revenue is finite. In contrast, if the optimal revenue is achieved by a finite price, then the optimal rate of decay is roughly $1/\sqrt{n}$. Finally, for distributions supported on discrete sets of values, we show that learning curves decay almost exponentially fast, a rate unattainable under the PAC framework.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

2604.26922

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Refined Lower Bounds for Adversarial Bandits

Sébastien Gerchinovitz, Tor Lattimore

Neural Information Processing SystemsMar-23-2026, 05:12:03 GMT

We provide new lower bounds on the regret that must be suffered by adversarial bandit algorithms. The new results show that recent upper bounds that either (a) hold with high-probability or (b) depend on the total loss of the best arm or (c) depend on the quadratic variation of the losses, are close to tight. Besides this we prove two impossibility results. First, the existence of a single arm that is optimal in every round cannot improve the regret in the worst case.

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.68)
North America > Canada > Alberta (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.96)
Information Technology > Data Science > Data Mining > Big Data (0.69)

Add feedback

Optimal Query Complexities for Dynamic Trace Estimation

David P. Woodruff, Carnegie Mellon University, dwoodruf@cs.cmu.edu "3026 Fred Zhang, UC Berkeley, z0@berkeley.edu, "3026 Qiuyi (Richard) Zhang, Google Brain, qiuyiz@google.com

Neural Information Processing SystemsFeb-12-2026, 11:18:01 GMT

Inther valued, forsuf"andanyp2[1,2], p log ( 1/ )/" p number isnecessary"kAkp errorwith1 .

artificial intelligence, hutchinson, theorem 5, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology: Information Technology > Artificial Intelligence (0.48)

Add feedback

Appendices

Neural Information Processing SystemsFeb-11-2026, 03:03:22 GMT

Let N(µ,σ2) denote a Gaussian distribution with meanµ and variance σ2. Let χ2(n) denote a χ2 distribution withn degrees of freedom. Our analysis extensively uses the following facts about Gaussian and χ2 distributions: Definition A.1 (Gaussian and Wigner Random Matrices). We let G N(n) denote an n n randomGaussianmatrixwith i.i.d. We let W W(n)=G+GT denotean n n Wigner matrix, where G N(n). Fact A.1 (χ2 TailBound(Lemma 1of[1])).

artificial intelligence, probability, probability 1, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > New York > New York County > New York City (0.04)

Technology: Information Technology > Artificial Intelligence (0.70)

Add feedback

1 2 Related Work 3 3 Setup 4 3.1 Objective and Error Decomposition 4 4 Algorithms 5 5 Main Result 6 6 Proof Sketch 7 6.1 Offline Uncertainty 7 6.2 Online Uncertainty

Neural Information Processing SystemsFeb-10-2026, 23:38:10 GMT

In this section we discuss how to adapt existing linear bandit results (mostly in the regret setting) to sample complexity results.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.49)
Information Technology > Data Science > Data Mining (0.47)

Add feedback

32c6d65ec2591dfcfb3f0e345a51f585-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 20:46:32 GMT

artificial intelligence, machine learning, theorem 3, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Optimal Lower Bounds for Online Multicalibration

Collina, Natalie, Lu, Jiuyao, Noarov, Georgy, Roth, Aaron

arXiv.org Machine LearningJan-9-2026

We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting where group functions can depend on both context and the learner's predictions, we prove an $Ω(T^{2/3})$ lower bound on expected multicalibration error using just three disjoint binary groups. This matches the upper bounds of Noarov et al. (2025) up to logarithmic factors and exceeds the $O(T^{2/3-\varepsilon})$ upper bound for marginal calibration (Dagan et al., 2025), thereby separating the two problems. We then turn to lower bounds for the more difficult case of group functions that may depend on context but not on the learner's predictions. In this case, we establish an $\widetildeΩ(T^{2/3})$ lower bound for online multicalibration via a $Θ(T)$-sized group family constructed using orthogonal function systems, again matching upper bounds up to logarithmic factors.

artificial intelligence, machine learning, prediction, (18 more...)

arXiv.org Machine Learning

2601.05245

Country: North America > United States (0.27)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A Lower Bound of Hash Codes' Performance

Neural Information Processing SystemsDec-25-2025, 03:12:41 GMT

As a crucial approach for compact representation learning, hashing has achieved great success in effectiveness and efficiency. Numerous heuristic Hamming space metric learning objectives are designed to obtain high-quality hash codes. Nevertheless, a theoretical analysis of criteria for learning good hash codes remains largely unexploited. In this paper, we prove that inter-class distinctiveness and intra-class compactness among hash codes determine the lower bound of hash codes' performance. Promoting these two characteristics could lift the bound and improve hash learning. We then propose a surrogate model to fully exploit the above objective by estimating the posterior of hash codes and controlling it, which results in a low-bias optimization. Extensive experiments reveal the effectiveness of the proposed method. By testing on a series of hash-models, we obtain performance improvements among all of them, with an up to $26.5\%$ increase in mean Average Precision and an up to $20.5\%$ increase in accuracy. Our code is publicly available at https://github.com/VL-Group/LBHash.

hash code, lower bound, name change, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Lower Bounds on Adversarial Robustness from Optimal Transport

Neural Information Processing SystemsDec-24-2025, 23:13:27 GMT

While progress has been made in understanding the robustness of machine learning classifiers to test-time adversaries (evasion attacks), fundamental questions remain unresolved. In this paper, we use optimal transport to characterize the maximum achievable accuracy in an adversarial classification scenario. In this setting, an adversary receives a random labeled example from one of two classes, perturbs the example subject to a neighborhood constraint, and presents the modified example to the classifier. We define an appropriate cost function such that the minimum transportation cost between the distributions of the two classes determines the \emph{minimum $0-1$ loss for any classifier}. When the classifier comes from a restricted hypothesis class, the optimal transportation cost provides a lower bound. We apply our framework to the case of Gaussian data with norm-bounded adversaries and explicitly show matching bounds for the classification and transport problems and the optimality of linear classifiers. We also characterize the sample complexity of learning in this setting, deriving and extending previously known results as a special case. Finally, we use our framework to study the gap between the optimal classification performance possible and that currently achieved by state-of-the-art robustly trained neural networks for datasets of interest, namely, MNIST, Fashion MNIST and CIFAR-10.

adversarial robustness, lower bound, name change, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback